API for secure automated file uploads

A robust, secure file-upload API is a central component for supplying an AI bot with new content continuously and automatically. Through this interface, documents (e.g. PDFs, text files, CSVs) are uploaded directly to the backend, where they are automatically processed and imported into the chatbot's knowledge base.
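From the client's side, a call to such an interface could look roughly like the following. This is a minimal sketch using only the Python standard library; the endpoint URL, the form field name "file" and the function name build_upload_request are assumptions made for illustration, not part of the API described here. The token is passed as a Bearer header, as explained in the next section.

```python
import mimetypes
import urllib.request
import uuid

UPLOAD_URL = "https://bot.example.com/api/upload"  # hypothetical endpoint


def build_upload_request(filename: str, content: bytes, token: str,
                         url: str = UPLOAD_URL) -> urllib.request.Request:
    """Build a multipart/form-data POST that carries one file and a JWT."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # Note: the filename is inserted as-is; a real client should escape quotes.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

The request object can then be sent with urllib.request.urlopen(request); how the client obtains its token depends on the deployment and is not shown.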

Secured by JSON Web Token (JWT)

To ensure that only authenticated and authorized systems or users can upload content, the API is secured with JSON Web Tokens (JWTs).

  • Authentication: Before an upload can take place, the client must send a valid JWT in a request header, usually in the form Authorization: Bearer <token>.
  • Authorization: The API validates the received token by checking its signature, expiry time and claims. The request is accepted only if the token is valid and carries the permissions (scopes) required for the upload.
  • Advantages: JWTs offer a stateless, scalable solution: the authorization information is contained in the token itself, so no server-side session lookup is required for each request.
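The checks listed above (signature, expiry, scope) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the API's actual implementation: it assumes HS256-signed tokens with a shared secret, a space-separated "scope" claim, and a required scope named "upload". The function names are hypothetical. A production service would normally rely on a maintained JWT library rather than hand-written verification.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_decode(segment: str) -> bytes:
    """Decode base64url text, restoring the padding that JWTs omit."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url text."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a token. Included only so the sketch can be tried end to end."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def authorize_upload(authorization_header: str, secret: bytes,
                     required_scope: str = "upload",  # assumed scope name
                     now: Optional[float] = None) -> dict:
    """Return the claims if the request may upload, else raise PermissionError."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        raise PermissionError("expected 'Authorization: Bearer <token>'")
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except ValueError as exc:  # wrong segment count, bad base64 or bad JSON
        raise PermissionError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise PermissionError("malformed token")
    # Pin the algorithm so a token cannot select "none" or another scheme.
    if header.get("alg") != "HS256":
        raise PermissionError("unsupported algorithm")
    # Verify the signature before trusting any claim; compare in constant time.
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("bad signature")
    current = time.time() if now is None else now
    expiry = claims.get("exp")
    if not isinstance(expiry, (int, float)) or current >= expiry:
        raise PermissionError("token expired or missing exp claim")
    if required_scope not in str(claims.get("scope", "")).split():
        raise PermissionError("token lacks the required scope")
    return claims
```

Because everything needed for the decision is in the token and the shared secret, the function needs no session store, which is the stateless property described above.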

Automated import functionality

Once the file has been successfully and securely uploaded, the backend logic takes over the rest of the process:

  • Storage and validation: The file is temporarily stored and checked for file type, size and integrity.
  • Parsing and extraction: The contents of the uploaded documents are read in a structured manner (e.g. text extracted from a PDF, rows and columns analyzed from a CSV).
  • Transformation: The raw data is converted into a form suitable for retrieval-augmented generation (RAG). This involves splitting the content into smaller blocks (chunks), normalizing the metadata and creating vector representations (embeddings).
  • Database import: The processed data is inserted into the AI bot's RAG database.
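The steps above can be sketched as a small pipeline. This is a simplified illustration, not the backend's actual code: the allowed file types, the size limit, the character-based chunking with overlap and all function names are assumptions. The embedding model is passed in as a callable because the source does not name one, PDF extraction is left out because it needs a third-party parser, and the final database insert is omitted because it depends on the vector store in use.

```python
import csv
import hashlib
import io
import os
from dataclasses import dataclass
from typing import Callable, List, Optional

ALLOWED_EXTENSIONS = {".txt", ".csv"}  # assumption; PDFs would need a third-party parser
MAX_BYTES = 10 * 1024 * 1024           # assumption: 10 MiB upload limit


@dataclass
class Record:
    text: str
    metadata: dict
    embedding: List[float]


def validate(filename: str, data: bytes, expected_sha256: Optional[str] = None) -> str:
    """Check file type, size and integrity; return the normalized extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or 'none'}")
    if not 0 < len(data) <= MAX_BYTES:
        raise ValueError("file is empty or too large")
    if expected_sha256 and hashlib.sha256(data).hexdigest() != expected_sha256.lower():
        raise ValueError("checksum mismatch")
    return extension


def extract_text(extension: str, data: bytes) -> str:
    """Read the content in a structured way; CSV rows become 'a | b' lines."""
    text = data.decode("utf-8")
    if extension == ".csv":
        rows = csv.reader(io.StringIO(text, newline=""))
        return "\n".join(" | ".join(cell.strip() for cell in row) for row in rows)
    return text


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into blocks of at most `size` characters that share `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be smaller than size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last block already reaches the end of the text
        start += size - overlap
    return chunks


def build_records(filename: str, data: bytes, embed: Callable[[str], List[float]],
                  size: int = 500, overlap: int = 50) -> List[Record]:
    """Run validation, extraction and transformation; return rows ready to import."""
    extension = validate(filename, data)
    digest = hashlib.sha256(data).hexdigest()
    chunks = chunk_text(extract_text(extension, data), size, overlap)
    return [
        Record(chunk, {"source": filename, "chunk_index": i, "sha256": digest}, embed(chunk))
        for i, chunk in enumerate(chunks)
    ]
```

Keeping the file's hash in each record's metadata is one possible way to detect re-uploads of unchanged files before they are imported again.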

This automated workflow lets the AI bot work with up-to-date information without manual intervention to maintain the data. The JWT protection ensures that only authenticated, authorized sources can add content, which is the basis for trusting the data source.